(54) IP-based Interactive Voice Response system for servicing calls from a PSTN

(57) An approach to abstracting the circuit switched
nature of the public switched telephone network (PSTN)
by using VoIP to provide voice actuated services is
disclosed. By carrying a telephone call using VoIP
technology for a short distance (frequently within a server
room), significant benefits to call handling and capacity
management can be obtained. Specifically, a PSTN-to-IP
gateway is used to receive (and place) calls over the
PSTN and route those calls internally to servers over an
IP network in a packet switched format. A number of
computer systems can receive and handle the calls in
the IP format, including: translating the packets into an
audio format suitable for speech recognition and creating
suitable packets from computer sound files for
transmission back over the PSTN.

Description

BACKGROUND OF THE INVENTION 
Field of the Invention 

[0001] This invention relates to the field of telephony.
In particular, the invention relates to technologies for
using voice over Internet Protocol (VoIP) solutions in a
number of configurations to increase flexibility and
reliability of call handling systems.

Description of the Related Art 

[0002] Figure 1 shows an example of the use of an
efficiently arranged prior art system for supporting voice
activated services over a telephone interface at element
130. Figure 1 superimposes that configuration on a high
level view of such a platform as illustrated by telephone
100 coupled to the telephone network 104, which is in
turn coupled to a telephone gateway 107, and a phone
application platform 110. In one embodiment, the phone
application platform 110 can correspond to a voice portal
that provides voice activated access to a variety of
information including personalized content. Such a
platform is described in greater detail in United States
Patent Application 09/426,102 entitled "Method and
Apparatus for Content Personalization over Telephone
Interface."

[0003] As Figure 1 shows, functionally the interface
with the telephone network 104 (e.g. the public switched
telephone network or PSTN) is conceptually separate
from the phone application platform 110. In order to
achieve efficient configurations with traditional telephony
equipment, however, the hardware to support those
functions may not be as cleanly separated, as shown in
element 130, where there is a physical termination of one
or more PSTN circuit switched calls, e.g. DS3 line in 112.
A single DS3 includes 28 primary rate interfaces (PRIs),
each including 24 dedicated voice channels for a total of
672 dedicated voice channels. In order to handle this
number of calls, the PRIs are multiplexed out using
multiplexer 114 to a collection of servers with telephony
cards 116A-Z for handling the PRI and the voice
communications channels therein. In one configuration, a
set of Dialogic signal cards model numbers D/480SC-2T1
and Antares^OOOxSO from Dialogic Corporation,
Parsippany, New Jersey, are used to handle the PRIs.
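
The channel arithmetic in the preceding paragraph can be checked with a short sketch. The constant and function names are illustrative only; the figure of 23 usable lines per PRI in North America is the one given in the Summary of the Invention.

```python
# Channel arithmetic for a single DS3, using the figures given in the text.
PRIS_PER_DS3 = 28        # primary rate interfaces carried by one DS3
CHANNELS_PER_PRI = 24    # dedicated voice channels per PRI
USABLE_PER_PRI = 23      # usable phone lines per PRI in North America


def ds3_channels(pris=PRIS_PER_DS3, per_pri=CHANNELS_PER_PRI):
    """Return the number of channels carried by `pris` PRIs."""
    return pris * per_pri


total_channels = ds3_channels()
usable_lines = ds3_channels(per_pri=USABLE_PER_PRI)
print(total_channels, usable_lines)
```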
[0004] Some inefficiencies result from the preceding
configuration. For example, in order to readily support
"tromboning" (connections between an incoming caller
and one or more parties on an outbound call) the two
calls need to be handled by the same server 116.
Similarly, features like conference calls have similar
dependencies. Accordingly, the telephone network 104
must be programmed to distribute the voice calls across
the PRIs within the DS3 to leave sufficient capacity for
outbound calling purposes. Further, physical proximity
between the telephone gateway 107 and the phone
application platform 110 is effectively enforced by the need
for the servers supporting the phone application platform
110 to be in sufficient proximity to allow termination
of circuit switched calls on those servers.
[0005] Figure 2 illustrates prior art uses of Voice over
Internet Protocol (VoIP) techniques to provide telephony
services. Prior to VoIP type technologies, a telephone
call from the telephone 200A to the telephone 200B
would be carried by a series of circuit switched
connections from the local telephone network 204A to the
long distance telephone network 210 and on to the local
telephone network 204B before reaching the telephone
200B. Some new entrants into the long distance market
have begun offering lower cost transmission through the
internet 208, and more generally packet switched
networks, using suites of protocols such as voice over
Internet Protocol (VoIP) and gateways such as the VoIP
gateways 206A-B. Frequently, such new entrants are
thought of as providing lower quality service than the
circuit switched network (this is frequently the case due
to the use of heavy compression as well as transmission
in a best effort network). Similarly, using VoIP some new
entrants encourage people to use their computers to
place voice (as well as video) calls from computer to
computer, e.g. computer 212A to computer 212B. Some
services even allow connections from a computer, e.g.
computer 212A, to a telephone, e.g. telephone 200A,
again in the hopes of providing cut rate services since
the calling party may be able to avoid many taxes and
surcharges typically imposed on long distance calling.
The prior approaches to providing voice activated
services have been focused on the circuit switched
orientation of the telephone network. Prior packet switched
approaches for handling voice communications have been
characterized by an end-to-end philosophy of call
placement. Accordingly, what is needed is a better
configuration for handling receipt and transmission of audio
from and to the telephone network 104 that provides
increased flexibility while maintaining compatibility with
the existing telephone network 104 by leveraging VoIP
standards to provide new services and functions.

SUMMARY OF THE INVENTION 

[0006] An approach to abstracting the circuit switched
nature of the public switched telephone network (PSTN)
by using VoIP to provide voice actuated services is
disclosed. By carrying a telephone call using VoIP
technology for a short distance (frequently within a server
room), significant benefits to call handling and capacity
management can be obtained. Specifically, a PSTN-to-IP
gateway is used to receive (and place) calls over the
PSTN and route those calls internally to servers over an
IP network in a packet switched format. A number of
computer systems can receive and handle the calls in
the IP format, including: translating the packets into an
audio format suitable for speech recognition and creating
suitable packets from computer sound files for
transmission back over the PSTN.

[0007] In some embodiments, a proxy server is used
to balance call load amongst a pool of server computers
handling the phone calls as they are passed off from the
gateway in IP form. This may also be used to reduce the
need to reserve capacity on specific server computers
based on circuit capacity. For example, in the prior art
configuration each telephony server readily supported
only a fixed number of circuits due to the physical
connectivity properties. Thus if a single PRI (23 usable
phone lines in North America) were connected to a
server, then to easily support outgoing calls (tromboning), it
is necessary to reserve capacity on that PRI. In contrast,
with a packet switched abstraction, the server does not
have to be concerned with which PRI, DS3, etc., is
handling the incoming and outgoing legs of the call session
since the capacity limit is solely based on total packet
network bandwidth and processor capability on the
server (both of which are more flexible than circuit
capacity). Similarly, advanced calling features such as
conference calling that would have previously required
reservation of a large number of ports on a single
telephony card can be handled more elegantly.
[0008] It should be noted that this approach is not
necessarily cost reducing, e.g. the cost of the telephony
gateway 107 and phone application platform 110 will not
necessarily be reduced. Rather, and perhaps
counter-intuitively, costs may go up since the PSTN-to-IP
gateway can be rather expensive, especially if purchased in
redundant pairs. Further, expensive network switches
and routers to support several thousand uncompressed
packet format data streams will be necessary as well.
In contrast, most VoIP installations make use of (heavy)
compression and expect only best effort delivery of
packets. The need to perform high quality speech
recognition makes such compression (as well as an
unreliable network) undesirable.

[0009] Additionally, this situation is counter-intuitive to
the general trend in VoIP telephony of establishing many
points of presence (POPs) throughout the nation to
avoid long distance charges. Rather, this approach
leverages the PSTN for what it is good at: long haul
transmission of voice data at a fixed quality of service and
then makes use of VoIP to abstract those details.
Telephone carriers who feel comfortable delivering calls
directly in VoIP formats may be permitted to terminate
their calls as such as well; however, that is not
necessary.

BRIEF DESCRIPTION OF THE FIGURES 

[0010] Fig. 1 illustrates a prior art system for supporting
voice activated services over a telephone interface.
Fig. 2 illustrates prior art uses of Voice over Internet
Protocol (VoIP) techniques to provide telephony services.
Fig. 3 illustrates a system including an embodiment of
the invention for supporting voice activated services
over a telephone interface.
Fig. 4 is a process flow diagram for handling a call
according to one embodiment of the invention.

DETAILED DESCRIPTION

A. Introduction 

[0011] The invention will be described in greater detail
as follows. First, a number of definitions useful to
understanding the invention are presented. Then, the
hardware and software architecture for localized voice over
Internet Protocol (VoIP) usage will be considered.
Finally, the processes and features of the environment are
presented in greater detail.

B. Definitions 

1. Telephone Identifying Information

[0012] For the purposes of this application, the term
telephone identifying information will be used to refer to
ANI information, CID information, and/or some other
technique for automatically identifying the source of a
call and/or other call setup information. For example,
telephone identifying information may include a dialled
number identification service (DNIS). Similarly, CID
information may include text data including the
subscriber's name and/or address, e.g. "Jane Doe". Other
examples of telephone identifying information might
include the type of calling phone, e.g. cellular, pay phone,
and/or hospital phone.

[0013] Additionally, the telephone identifying
information may include wireless carrier specific identifying
information, e.g. location of wireless phone now, etc. Also,
signalling system seven (SS7) information may be
included in the telephone identifying information.

2. User Profile 

[0014] A user profile is a collection of information
about a particular user. The user profile typically
includes collections of different information of relevance
to the user, e.g., account number, name, contact
information, user-id, default preferences, and the like.
Notably, the user profile contains a combination of explicitly
made selections and implicitly made selections.
[0015] Explicitly made selections in the user profile
stem from requests by the user to the system. For
example, the user might add business news to the main
topic list. Typically, explicit selections come in the form
of a voice, or touch-tone command, to save a particular
location, e.g. "Remember this", "Bookmark it", "shortcut
this", pound (#) key touch-tone, etc., or through
adjustments to the user profile made through the web interface
using a computer.

[0016] Additionally, the user profile provides a useful
mechanism for associating telephone identifying
information with a single user, or entity. For example, Jane
Doe may have a home phone, a work phone, a cell
phone, and/or some other telephones. Suitable
telephone identifying information for each of those phones
can be associated in a single profile for Jane. This allows
the system to provide uniformity of customization to a
single user, irrespective of where they are calling from.
[0017] In contrast, implicit selections come about
through the conduct and behaviour of the user. For
example, if the user repeatedly asks for the weather in Palo
Alto, California, the system may automatically provide
the Palo Alto weather report without further prompting.
In other embodiments, the user may be prompted to
confirm the system's implicit choice, e.g. the system
might prompt the user "Would you like me to include
Palo Alto in the standard weather report from now on?"
[0018] Additionally, the system may allow the user to
customize the system to meet her/his needs better. For
example, the user may be allowed to control the
verbosity of prompts, the dialect used, and/or other settings for
the system. These customizations can be made either
explicitly or implicitly. For example, if the user is providing
commands before most prompts are finished, the
system could recognize that a less verbose set of prompts
is needed and implicitly set the user's prompting
preference to briefer prompts.

3. Topics and Content 

[0019] A topic is any collection of similar content.
Topics may be arranged hierarchically as well. For example,
a topic might be business news, while subtopics might
include stock quotes, market report, and analyst reports.
Within a topic different types of content are available.
For example, in the stock quotes subtopic, the content
might include stock quotes. The distinction between
topics and the content within the topics is primarily one of
degree in that each topic, or subtopic, will usually
contain several pieces of content.


4. Demographic and Psychographic Profiles 

[0020] Both demographic profiles and psychographic
profiles contain information relating to a user.
Demographic profiles typically include factual information,
e.g. age, gender, marital status, income, etc.
Psychographic profiles typically include information about
behaviours, e.g. fun loving, analytical, compassionate, fast
reader, slow reader, etc. As used in this application, the
term demographic profile will be used to refer to both
demographic and psychographic profiles.



C. VoIP Configuration 

[0021] Figure 3 illustrates a system including an
embodiment of the invention for supporting voice activated
services over a telephone interface. The top portion of
the figure shows the functional components labelled
according to the labelling of Figure 1, e.g. telephone 100,
telephone network 104, telephone gateway 107, and
phone application platform 110. The bottom portion
shows the new implementation approach that is based
on a VoIP architecture. The implementation
components of the telephone gateway 107 are shown in
element 320 while the implementation components for a
portion of the phone application platform 110 are shown
in element 330.

[0022] Unlike in the prior art system, there is a clean
separation between the telephone gateway 107
implementation and the phone application platform 110
implementation. This promotes modularity and improves
functionality. The telephone gateway 107 is supported
by one or more media gateways 302. A media gateway
is a term for products such as Cisco AS5300 from Cisco
Corporation, San Jose, California, GSX 9000 from
Sonus Networks, Inc., Westford, Massachusetts, and
MultiVoice MAX TNT from Lucent Technologies, Murray
Hill, New Jersey. More generally the media gateway 302
is a device for routing circuit switched telephone network
calls to a packet switched network (and vice-versa).
Some media gateways may be capable of handling
several thousand calls simultaneously. Further, as
appropriate, redundant media gateways can be configured to
interoperate appropriately with the telephone network
104.

[0023] Importantly, to the left of the media gateway
302 in Figure 3, a telephone call is carried in a circuit
switched fashion while on the right it is carried in a
packet switched fashion. This avoids the problem of
established telecommunication carriers who are unprepared
to provide direct VoIP connections to customers (see,
e.g. left side of Figure 2, showing that VoIP carriers start
and terminate circuit switched calls). If the
telecommunication carrier supports it, the telephone gateway
107 can also include facilities for directly receiving VoIP
calls.

[0024] Before discussing call completion, consider
the implementation of the phone application platform
110. A number of computers, servers 306A-Z, can be
provided together with a session initiation protocol (SIP)
proxy 304. The servers 306A-Z can be comprised of one
or more computers, typically of a server, or rack mount
variety. According to one embodiment, a Network
Engine server from Network Engines, Inc., Canton,
Massachusetts, is used for the servers 306A-Z because it
is a compact, 1 rack unit (1U) high, yet powerful
computer system.

[0025] Through the use of one or more (proposed)
standard Internet Engineering Task Force (IETF)
protocols such as SIP (RFC 2543), the SIP proxy 304 can
relay information from the media gateway 302 to the
servers 306A-Z about incoming calls and allow them to
handle the sessions. The term "proxy" is used to
describe the SIP proxy 304; however, such use is not in
strict conformance with the definition in RFC 2543.
Rather, the SIP proxy 304 may be in the terms of RFC
2543 a "proxy", a "proxy server", a "redirect server", a
"server", and/or some other type of device and/or
program for balancing distribution of SIP requests
(incoming calls) across the servers 306A-Z.
[0026] The call handling flow according to the
implementation in Figure 3 will now be considered in
connection with Figure 4. First, at step 400, a call is received
at the phone number of the phone application platform
110. For this example, the phone number will be +1
(800) 555-TELL (555-TELL and 5555TELL are
registered trade marks of Tellme Networks, Inc.); however,
other numbers could be used, e.g. international free
phone numbers +800 5555-TELL, country specific
numbers, and non-free numbers, e.g. +1 (650) 555-1212.
The phone call is received when the circuit switched
telephone network 104 carries the signal (via a circuit) to
the telephone gateway 107 (and thus the media
gateway 302).

[0027] Next, at step 402, a SIP request is generated
(see RFC 2543 generally for format) by the media
gateway 302 to the SIP proxy 304. The SIP request can
include suitable telephone identifying information, e.g.
dialled number, calling party number, ANI, etc. The SIP
proxy 304 will then redirect, proxy, forward, and/or
otherwise cause the request to be passed to one of the
servers 306A-Z for acknowledgement and handling.
Criteria for distribution amongst the servers may include:
the telephone identifying information (e.g. some servers
are reserved for certain calling (or called) parties);
server load (e.g. evenly distribute workload across the
different servers relative to their capacity to handle calls);
online/offline status of individual servers; network
monitoring showing faults with one or more servers; and/or
other criteria selected by the operator of the phone
application platform 110.
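
As a rough illustration of the distribution criteria listed above, the following sketch filters a pool of servers by online status, monitored faults, and reservation for particular dialled numbers. The `Server` class, its field names, and the example numbers are hypothetical and are not part of the disclosed system; load-based selection is left out here.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical record the proxy might keep for each server 306A-Z."""
    name: str
    online: bool = True
    faulty: bool = False                              # set by network monitoring
    reserved_for: set = field(default_factory=set)    # dialled numbers; empty = any


def eligible_servers(servers, dialled_number):
    """Return the servers that may be offered a call to `dialled_number`."""
    healthy = [s for s in servers if s.online and not s.faulty]
    # Servers reserved for this called party are preferred when any exist.
    reserved = [s for s in healthy if dialled_number in s.reserved_for]
    if reserved:
        return reserved
    # Otherwise fall back to healthy servers that are not reserved at all.
    return [s for s in healthy if not s.reserved_for]


pool = [
    Server("306A"),
    Server("306B", online=False),
    Server("306C", reserved_for={"+18005550100"}),
    Server("306D", faulty=True),
]
print([s.name for s in eligible_servers(pool, "+18005550100")])
print([s.name for s in eligible_servers(pool, "+16505551212")])
```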

[0028] For example, according to one embodiment, in
order to test a new hardware and/or software
configuration of a particular server (e.g. the server 306Z) a
predetermined percentage of calls might be routed to that
server. Similarly, if better servers become available
and are added to the existing pool, the distribution of
calls could be evenly distributed based on weighted
capacity. In such a configuration, a server that could
handle 100 simultaneous calls versus an earlier server that
only handled 50 would be considered equally loaded
based on the ratio of number of current calls to capacity,
e.g. 5 on the older server, and 10 on the newer server
are equivalent: 5/50 = 1/10 = 10/100.
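
The weighted-capacity comparison can be sketched with exact fractions. The function names and the dictionary layout below are assumptions made for illustration; the text does not prescribe a particular selection rule beyond comparing the ratio of current calls to capacity.

```python
from fractions import Fraction


def load_ratio(current_calls, capacity):
    """Ratio of calls in progress to the number the server can handle."""
    return Fraction(current_calls, capacity)


def least_loaded(servers):
    """Pick the server with the lowest load ratio.

    `servers` maps a server name to a (current_calls, capacity) pair.
    Ties resolve to whichever name is encountered first.
    """
    return min(servers, key=lambda name: load_ratio(*servers[name]))


older = load_ratio(5, 50)      # earlier server handling up to 50 calls
newer = load_ratio(10, 100)    # newer server handling up to 100 calls
print(older == newer)          # the two servers are equally loaded

# One more call on the older server tips the next call to the newer one.
print(least_loaded({"older": (6, 50), "newer": (10, 100)}))
```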
[0029] Note that this sort of flexible load balancing is
not readily possible with the prior art configuration of
Figure 1 since call handling capacity is a direct function of
terminated circuits (e.g. number of PRIs). Thus, the prior
art servers 116 cannot as easily take advantage of
improvements in processing power without replacing the
physical telephony hardware to support higher density
circuit termination.

[0030] In some embodiments, the functionality of the
SIP proxy 304 can be subsumed in whole or in part into
the media gateway 302. The ability to do this will depend
in large part on the monitoring and routing capabilities
of the particular media gateway 302.
[0031] Next, at step 404, the SIP request is
acknowledged by the selected server 306A-Z. At that point, the
data (e.g. voice channel, or stream) flows between the
server, the media gateway, and the telephone network
104. The data portion can be sent using one or more
standard International Telecommunication Union (ITU)
and/or IETF protocols, e.g. RTSP, RTP, Q.931, etc.
[0032] In one embodiment, compression of the
stream is intentionally disabled between the media
gateway 302 and the servers 306A-Z. Typically, VoIP data
transmissions use (heavy) compression to reduce
bandwidth demands; however, such compression could
severely reduce the quality of speech recognition results
and thus is not used. While the lack of compression
would be undesirable in many other VoIP environments
due to high bandwidth consumption for thousands of
VoIP streams, the operator of the phone application
platform need only provide high bandwidth in between the
media gateway 302 and the servers 306 (frequently only
a short distance, e.g. within a server room, etc.).
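
To give a sense of the bandwidth involved, the sketch below estimates the IP-layer rate of uncompressed streams. The text does not name a codec or packet size, so the figures used here are assumptions: 8 kHz, 8-bit G.711-style audio (64 kbit/s, the rate of a single PSTN voice channel), 20 ms of audio per packet, and 40 bytes of RTP, UDP, and IPv4 headers per packet. Link-layer framing is ignored and the result is per direction.

```python
SAMPLE_RATE_HZ = 8000        # assumption: telephone-band sampling rate
BYTES_PER_SAMPLE = 1         # assumption: 8-bit G.711-style samples
PACKET_MS = 20               # assumption: 20 ms of audio per packet
HEADER_BYTES = 12 + 8 + 20   # RTP + UDP + IPv4 headers, no options


def stream_bits_per_second():
    """IP-layer bit rate of one uncompressed stream, one direction."""
    payload_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * PACKET_MS // 1000
    packets_per_second = 1000 // PACKET_MS
    return (payload_bytes + HEADER_BYTES) * packets_per_second * 8


def aggregate_mbps(calls):
    """Aggregate one-direction rate in Mbit/s for `calls` simultaneous streams."""
    return calls * stream_bits_per_second() / 1_000_000


print(stream_bits_per_second())   # per-stream rate in bit/s
print(aggregate_mbps(672))        # one DS3 worth of channels
```

Under these assumptions a full DS3 of calls needs on the order of tens of megabits per second in each direction, which is modest for a network confined to a server room.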
[0033] Lastly, at step 406, the servers communicate
with the media gateway using SIP requests to control
handling of the session (call). Unlike the servers with
telephony cards 116A-Z of Figure 1, the servers 306A-Z
cannot directly control handling of the circuit switched
line. (Recall that in the configuration of Figure 1, one or
more circuit switched PRIs terminated at each server
with telephony cards 116A-Z and the telephony cards
could directly control the circuit, e.g. the call.) Instead,
to control call handling features (e.g. request
termination of the call) or other special features (e.g., the
communication may be to redirect an RTP media stream(s)
to accomplish tromboning independent of the server
306A-Z), one or more appropriate messages can be
sent according to the SIP protocol.
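
As a simplified illustration of requesting termination of a call, the sketch below assembles the text of a SIP BYE request in the general style of RFC 2543. It is not a complete or validated SIP implementation: tags, authentication, routing headers, and the network transport are omitted, and every address and identifier shown is hypothetical.

```python
def build_bye(call_id, local_uri, remote_uri, via_host, cseq):
    """Return the text of a minimal SIP BYE request.

    The request is addressed to `remote_uri` (e.g. the media gateway) and
    must reuse the Call-ID of the session being ended.
    """
    lines = [
        f"BYE {remote_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host}",
        f"From: <{local_uri}>",
        f"To: <{remote_uri}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} BYE",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and a blank line to end the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


message = build_bye(
    call_id="a84b4c76e66710@server306a.example.com",   # hypothetical
    local_uri="sip:ivr@server306a.example.com",        # hypothetical
    remote_uri="sip:gw@gateway.example.com",           # hypothetical
    via_host="server306a.example.com",
    cseq=2,
)
print(message)
```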
[0034] As an example, if the initial caller to the phone
application platform 110 requests an outbound call
transfer (e.g. place a call to a third party), one or more
SIP requests could be generated by the servers 306A-Z
to the media gateway 302 (possibly via the SIP proxy
304) to cause the initiation of the call. For example, to
contact a restaurant, the server could request a call
placement to the phone number of the restaurant be
added to the in progress session between the initial
caller and the server. The media gateway 302 and/or the
SIP proxy 304 could respond to this request by
(ultimately) opening circuit switched connections back over
the telephone network 104 to the restaurant. Notice,
importantly, that there is no longer a need to reserve
circuits on any particular line or interface.
[0035] Thus, despite only using the VoIP technologies
in the last "100 metres" or so, e.g. within a server room,
some significant functionality becomes available that
also serves to increase flexibility: easier multi-party
features and elimination of reserved circuit capacity. In one
embodiment, VoIP can be viewed as providing an
abstraction layer to the circuit switched network.
[0036] In United States Patent Application
09/426,102, entitled "Method and Apparatus for Content
Personalization Over a Telephone Interface", having
inventors Hadi Partovi, et. al., a functional decomposition
of a phone application platform substantially similar to
the instant phone application platform 110 is presented.
According to that functional model, the servers 306A-Z
could provide a subset of the identified functions such
as call management, execution, evaluation, data
connectivity, and/or streaming. The specific functions
provided by the servers 306A-Z will depend on their
processing power, capacity, and number. For example,
in the prior art arrangement of Figure 1, the servers with
telephone cards 116A-Z could only handle a specific
number of calls as determined by the physical
connectivity of the boxes to the telephone network. In contrast,
the number of calls handled by the servers 306A-Z can
be adjusted for their processing power, current load, an
operator-imposed cap (e.g. no more than N calls per
server with an eye towards a specific quality of service),
and/or other criteria. In a preferred embodiment, servers
306A-Z each include a VoiceXML interpreter so that
they may be programmed to perform a wide variety of
call handling tasks. VoiceXML (or Voice eXtensible
Markup Language) is the name of a programming
language promulgated by the VoiceXML Forum (an
industry forum founded by AT&T, IBM, Lucent and Motorola)
for designing and creating audio dialogues that include,
inter alia, synthesized speech, voice-recognition,
streaming audio and DTMF input.
[0037] In one embodiment, the SIP proxy 304
distributes load evenly across the servers 306A-Z and
monitors their load through one or more communication
channels, e.g. periodic queries to the servers 306A-Z.
If the number of calls at a given time exceeds a
predetermined threshold, one or more messages may be
generated by the SIP proxy 304 (or one of the servers
306A-Z) to instruct the media gateway 302. The message
might indicate that no more calls should be taken, e.g.
busy the line. Or more specifically, when the servers
306A-Z are handling calls from multiple legal entities,
the message might stop the acceptance of calls for one
legal entity (e.g. by dialled phone number) in accordance
with one or more limits (e.g. contracts, fairness
(everyone has to have capacity for at least X calls), etc.).
Responsive to such a message, the
media gateway 302 may send one or more messages
over the PSTN, e.g. using signalling system 7 (SS7) or
such other protocols as may be available. As a result,
calls to a first number, +1 (800) 555-TELL, might be able
to proceed while calls to +1 (800) PAR-TNER might
receive a busy signal or some other network status
message, e.g. "All circuits are busy".
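
The threshold behaviour described above might be sketched as a small admission-control object. The class, its limits, and the example numbers are hypothetical; the sketch only decides whether a call is accepted and leaves out the messaging between the SIP proxy 304, the media gateway 302, and the PSTN.

```python
class AdmissionControl:
    """Tracks calls in progress and decides whether a new call is accepted."""

    def __init__(self, total_limit, per_number_limit):
        self.total_limit = total_limit              # threshold across all servers
        self.per_number_limit = per_number_limit    # dialled number -> cap
        self.active = {}                            # dialled number -> calls in progress

    def try_accept(self, dialled_number):
        """Return True and count the call, or False if it should be busied."""
        if sum(self.active.values()) >= self.total_limit:
            return False
        in_use = self.active.get(dialled_number, 0)
        cap = self.per_number_limit.get(dialled_number)
        if cap is not None and in_use >= cap:
            return False
        self.active[dialled_number] = in_use + 1
        return True

    def release(self, dialled_number):
        """Record that a call to `dialled_number` has ended."""
        if self.active.get(dialled_number, 0) > 0:
            self.active[dialled_number] -= 1


MAIN, PARTNER = "+18005550100", "+18005550199"   # hypothetical numbers
control = AdmissionControl(total_limit=3, per_number_limit={PARTNER: 1})
results = [control.try_accept(n) for n in (PARTNER, PARTNER, MAIN, MAIN, MAIN)]
print(results)
```

In this run the second call to the capped number is refused while calls to the other number continue until the overall threshold is reached.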
[0038] The above type of differentiated and targeted
service control is not readily possible in the circuit
switched configuration of Figure 1 because of the lack
of cross-communication between the servers with
telephony cards 116A-Z and the lack of a centralized
communication with the switching systems of the telephone
network 104.

[0039] In the case where the connectivity between the 
media gateway 302 and the telephone network 104 
does not easily support low level communication to al- 
low the media gateway 302 to control the behaviour of 
the telephone network 104, the media gateway 302 can 
send SIP requests to a special destination, e.g. an extra 
server of substantially the same type as the servers 
306A-Z to cause a message to be played and then ter- 
minate the call. In other embodiments, if the media gate- 
way 302 supports the capability, it can generate and play 
back a busy message for specific numbers at specific 
times. 

[0040] Returning to the prior art arrangement of
Figure 1, the telephony cards in the servers 116A-Z
typically included digital signal processors (DSPs) for
processing the audio and assisting in a variety of ways
with voice recognition. For example, the Nuance speech
recognition system from Nuance Corporation, Mountain
View, California, comes configured to support Dialogic
telephony cards with certain features occurring on the
card. Similarly, the audio providers (the software for
working with the hardware cards to get/send audio) are
configured in many instances to make use of the DSPs
on the telephony cards. Those software audio providers
accordingly have to be re-written according to the
present invention to rely on the processor(s) in the
server 306A-Z to send and get requests to/from network
packets in a suitable VoIP data transmission format (as
negotiated using SIP) and/or to generate/manage
additional SIP requests. Specific functions include decoding
received network packets containing audio data and
preparing them for voice recognition processing,
including: echo cancellation, noise filtering, end pointing, and
speech recognition. Other functions of the audio
provider include taking sounds such as streaming audio and
other audio files and converting them into network
packets according to the data transmission format.
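
One of the audio provider functions described above, converting audio into network packets and back, can be sketched using RTP framing. This is a minimal illustration under stated assumptions rather than the disclosed implementation: the audio is assumed to be already encoded as 8 kHz, 8-bit G.711 mu-law (RTP static payload type 0), each packet carries 20 ms of audio, and the received packets are assumed to have no CSRC list or header extension.

```python
import struct

RTP_VERSION = 2
PAYLOAD_TYPE_PCMU = 0   # static RTP payload type for G.711 mu-law
FRAME_BYTES = 160       # 20 ms of 8 kHz, 8-bit audio


def packetize(audio, ssrc, first_seq=0, first_ts=0):
    """Split encoded audio bytes into RTP packets (12-byte header + payload)."""
    packets = []
    for offset in range(0, len(audio), FRAME_BYTES):
        frame = audio[offset:offset + FRAME_BYTES]
        index = offset // FRAME_BYTES
        header = struct.pack(
            "!BBHII",
            RTP_VERSION << 6,                      # V=2, no padding/extension/CSRCs
            PAYLOAD_TYPE_PCMU,                     # marker bit clear
            (first_seq + index) & 0xFFFF,          # 16-bit sequence number
            (first_ts + offset) & 0xFFFFFFFF,      # timestamp counts samples
            ssrc,
        )
        packets.append(header + frame)
    return packets


def depacketize(packet):
    """Return (sequence number, timestamp, payload) from a simple RTP packet."""
    _flags, _ptype, seq, timestamp, _ssrc = struct.unpack("!BBHII", packet[:12])
    return seq, timestamp, packet[12:]


packets = packetize(bytes(400), ssrc=0x1234)
print(len(packets), [len(p) for p in packets])
```

The payload returned by `depacketize` would then be handed to the echo cancellation, noise filtering, end pointing, and recognition stages named in the text.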
[0041] Additional protocols may be used in
conjunction with SIP to further support the VoIP arrangement
disclosed. For example, the PINT protocol of RFC 2848
may be used to communicate out from the phone
application platform 110 to the circuit switched telephone
network 104 for one or more purposes, e.g. for outbound
call notification.



[0042] According to some embodiments of the
invention, one or more additional computers can be coupled
in communication with the phone application platform
110, e.g. configuration server 310 (shown as part of
phone application platform 110). The configuration
server 310 is designed to allow easy setup of the servers
306A-Z, the SIP proxy 304, and/or other computers
providing the phone application platform. Configuration
server 310 typically includes host descriptions (i.e., the
software configuration that is mapped to each
respective server 306A-Z) and a service map (i.e., information
that identifies how the set of servers 306A-Z are
assigned in order to maintain an operational phone
platform 110).

[0043] The configuration server 310 can leverage
existing protocols that are available within the respective
computers to offer these features. As a result, given a
unique identifier for a machine such as a hardware
Ethernet address, aka media access control (MAC)
address, a processor serial number, a stored value (e.g.
hostname and/or Internet protocol (IP) address), and/or
some other unique identifier, machines can be
automatically configured with the necessary software.
[0044] This process is referred to as "blasting" or
"jumpstarting" and is different from, but complementary
to, network booting and dynamic host configuration
protocol (DHCP). More specifically, the blasting process
creates a working system image on the blasted
computer together with all appropriate software.
[0045] For example, if the server 306A were being repurposed
from performing speech recognition to handling telephony, an
entry on the configuration server 310 for the server 306A
could be modified to indicate the new machine purpose. Then,
using a net boot (or floppy boot), the machine could load an
image from the configuration server 310 that causes the
machine to be configured to behave according to its new
purpose. For example, the hard drive might be re-partitioned,
a new operating system loaded (e.g. from Windows(TM) NT to
Solaris(TM) or FreeBSD), software removed or installed (SIP
server and audio providers installed while speech recognition
packages are removed), etc.
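
The repurposing flow described above can be illustrated with a
minimal sketch. It is not the implementation disclosed here: the
class names, field names, package names, and the MAC address are
all hypothetical, and the configuration server is reduced to an
in-memory table keyed by a unique identifier. It only shows the
idea that a machine's configuration follows from its entry, so
that editing the entry is the only manual step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class HostDescription:
    """Software configuration for one machine (field names are hypothetical)."""
    purpose: str
    operating_system: str
    packages: list[str] = field(default_factory=list)


class ConfigurationServer:
    """Toy in-memory stand-in for configuration server 310."""

    def __init__(self) -> None:
        # Maps a unique identifier (here a MAC address) to a host description.
        self._entries: dict[str, HostDescription] = {}

    def set_entry(self, unique_id: str, description: HostDescription) -> None:
        """Create or overwrite the entry for a machine."""
        self._entries[unique_id.lower()] = description

    def image_for(self, unique_id: str) -> HostDescription:
        """Return what a machine should be configured with on its next boot."""
        return self._entries[unique_id.lower()]


config = ConfigurationServer()
mac_306a = "00:A0:C9:14:C8:29"  # made-up address standing in for server 306A

# Server 306A starts out performing speech recognition.
config.set_entry(mac_306a, HostDescription(
    purpose="speech recognition",
    operating_system="Windows NT",
    packages=["speech-recognition"],
))

# Repurpose it for telephony by editing only its entry.
config.set_entry(mac_306a, HostDescription(
    purpose="telephony",
    operating_system="FreeBSD",
    packages=["sip-server", "audio-provider"],
))

# On the next net boot the machine asks for its image by unique identifier.
image = config.image_for(mac_306a.lower())
print(image.purpose, image.operating_system, image.packages)
```

In a real deployment the lookup would be performed over the
network during boot, and the returned description would drive
partitioning and installation rather than being printed.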

[0046] The net result is that minimal (or no) human
intervention is needed once the machine's entry in the
configuration server 310 is updated; hence the respective
configurations of servers 306A-Z are effectively "slaved" to
the corresponding entries in configuration server 310.
Deployment of configuration server 310 provides a number of
other benefits, inter alia: (i) automated software
(re)configuration and updates for extant or replacement
servers 306A-Z; (ii) automated management, assignment,
re-assignment, and control of system resources via
configuration server 310; and (iii) automated system
monitoring, inventory tracking, auditing, and alarming (in
the event of errors or failures). According to one embodiment
of the invention, the configuration server 310 includes
appropriate images of operating systems, software, and/or
configuration files for the full range of computers used by
the phone application platform 110. Additionally, a database
(or table) showing correspondences between a unique
identifier for each computer and configuration options



E. Conclusion 

[0047] By abstracting the circuit switched nature of the
broader telephone network in the last 100 or so metres, e.g.
within a server room, surprising benefits can result as
described above. Further, these benefits outweigh the
sometimes higher costs of such an arrangement due to the need
for expensive equipment (e.g. media gateways) and high
bandwidth packet based routing and switching fabrics between
the media gateways and the servers.

[0048] Accordingly, a method and apparatus for using voice
over Internet Protocol (VoIP) technologies in a localized
fashion has been described. The approach allows improved
capacity and flexibility in providing voice activated
services. Further, the approach has several natural
extensions such as internally routing calls in VoIP format to
remote servers, e.g. for overflow to a remote data centre
from the location of the servers 306A-Z. Similarly, if costs
for using the packet switched network are sufficiently
cheaper than the circuit switched telephone network 104, some
outbound calls could be placed through a VoIP carrier (e.g.
by directing the media gateway 302 to route outbound calls
using VoIP to a VoIP gateway belonging to a
telecommunications carrier or one belonging to the operator
of the phone application platform 110).
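
The overflow extension mentioned above can be sketched in a few
lines. This is an illustration under stated assumptions only:
the function name, the per-server capacity limit, and the
"remote-data-centre" label are hypothetical, and the text does
not prescribe any particular selection rule.

```python
from __future__ import annotations


def pick_server(local_load: dict[str, int], capacity: int, remote: str) -> str:
    """Pick the least-loaded local server, or overflow to the remote site.

    local_load maps a local server name to its current number of calls;
    capacity is an assumed per-server limit on simultaneous calls.
    """
    if not local_load:
        # No local servers are available at all, so overflow immediately.
        return remote
    name, calls = min(local_load.items(), key=lambda item: item[1])
    return name if calls < capacity else remote


# Server 306B has spare capacity, so the call stays local.
print(pick_server({"306A": 3, "306B": 1}, capacity=4, remote="remote-data-centre"))

# Every local server is full, so the call overflows to the remote site.
print(pick_server({"306A": 4, "306B": 4}, capacity=4, remote="remote-data-centre"))
```

Because the call is already carried in VoIP format, the choice
between a local and a remote destination reduces to selecting a
different IP destination for the same packets.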
[0049] In some embodiments, phone application platform 110
and the development platform web server 108 can be hardware
based, software based, or a combination of the two. In some
embodiments, phone application platform 110 is comprised of
one or more computer programs that are included in one or
more computer usable media such as CD-ROMs, floppy disks, or
other media. In some embodiments, audio providers, SIP
servers, SIP clients, SIP proxies, and/or some other type of
SIP program, are included in one or more computer usable
media.

[0050] Some embodiments of the invention are included in an
electromagnetic waveform. The electromagnetic waveform
comprises information such as audio providers, SIP servers,
SIP clients, SIP proxies, and/or some other type of SIP
program. The electromagnetic waveform may include the
programs accessed over a network.

[0051] The foregoing description of various embodiments of
the invention has been presented for purposes of illustration
and description. It is not intended to limit the invention to
the precise forms disclosed. Many modifications and
equivalent arrangements will be apparent.



Claims 

1. A computerized, Internet protocol (IP) based voice
response system for servicing a call received over a public
switched telephone network (PSTN) comprising:

a PSTN-to-IP gateway for connecting to the public switched
telephone network;

an IP network medium connected to the gateway; and

a network server in communication with the network medium for
automated interaction with a user participating in the call.

2. The voice response system of Claim 1, wherein the
network server comprises a host computer for exe- 
cuting a voice application program, a grammar da- 
tabase corresponding to a set of recognizable utter- 
ances, and a voice recognition engine for compar- 
ing a speech input from the user against the set of 
recognizable utterances. 

3. The voice response system of Claim 2, wherein the
voice application program is a VoiceXML program. 

4. The voice response system of Claim 2 or 3, further 
comprising a firewall in communication with the net- 
work medium for connecting the network server to 
an external IP network through the firewall, wherein 
the voice application program is remotely hosted on 
the external IP network. 

5. The voice response system of Claim 2, 3 or 4, 
wherein the network server performs call control 
communications with the PSTN-to-IP gateway in
accordance with a SIP protocol. 

6. A scalable, computerized, Internet protocol (IP) based
voice response system for servicing a plurality of calls
received over a public switched telephone network (PSTN)
comprising:

a PSTN-to-IP gateway for connecting to the public switched
telephone network;

an IP network medium connected to the gateway;

a plurality of network servers in communication with the
network medium for automated interaction with a set of users
participating in the plurality of calls; and

a proxy server in communication with the PSTN-to-IP gateway
for load balancing the plurality of calls amongst the
plurality of network servers.

7. The voice response system of Claim 6, wherein each network
server of the plurality of network servers comprises a host
computer having a distinct network identification number.

8. The voice response system of Claim 7, further comprising a
configuration server for automatically loading and
configuring an initial software environment for the host
computer during its initial bootup sequence based upon the
network identification number.

9. A method of using voice over Internet protocols (VoIP) to
handle circuit switched calls in a voice activated system,
the method comprising:

terminating a circuit switched call at a conversion device
that translates the circuit switched call into a VoIP format
as a packet switched call;

forwarding the packet switched call in the VoIP format from
the conversion device to a computer system; and

performing speech recognition on the call using audio data
extracted from the VoIP format by the computer system.

10. The method of Claim 9, wherein the conversion device and
the computer system are located in close physical proximity.

11. The method of Claim 9, wherein there is a second computer
system physically distant from the conversion device and
wherein the forwarding goes to the second computer system
responsive to a failure of the first computer system.

12. The method of Claim 9, 10 or 11, further comprising prior
to the forwarding sending a message from the conversion
device to a second computer system, the second computer
system selecting the computer system from a plurality of
computer systems to receive the call.

13. The method of Claim 12, wherein the selecting is
according to a predetermined set of criteria to balance the
number of calls being handled by each of the plurality of
computer systems.

14. The method of Claim 12 or 13, wherein the message
comprises a session initiation protocol (SIP) request.

15. The method of Claim 12, 13 or 14, wherein the forwarding
occurs responsive to a SIP acknowledgement from the computer
system.